Multivariate Variational Autoencoder

Yavuz, Mehmet Can

arXiv.org Artificial Intelligence

Learning latent representations that are simultaneously expressive, geometrically well-structured, and reliably calibrated remains a central challenge for Variational Autoencoders (VAEs). Standard VAEs typically assume a diagonal Gaussian posterior, which simplifies optimization but rules out correlated uncertainty and often yields entangled or redundant latent dimensions. We introduce the Multivariate Variational Autoencoder (MVAE), a tractable full-covariance extension of the VAE that augments the encoder with sample-specific diagonal scales and a global coupling matrix. This induces a multivariate Gaussian posterior of the form $\mathcal{N}(\mu_\phi(x),\, C \operatorname{diag}(\sigma_\phi^2(x))\, C^\top)$, enabling correlated latent factors while preserving a closed-form KL divergence and a simple reparameterization path. Beyond likelihood, we propose a multi-criterion evaluation protocol that jointly assesses reconstruction quality (MSE, ELBO), downstream discrimination (linear probes), probabilistic calibration (NLL, Brier, ECE), and unsupervised structure (NMI, ARI). Across Larochelle-style MNIST variants, Fashion-MNIST, and CIFAR-10/100, MVAE consistently matches or outperforms diagonal-covariance VAEs of comparable capacity, with particularly notable gains in calibration and clustering metrics at both low and high latent dimensions. Qualitative analyses further show smoother, more semantically coherent latent traversals and sharper reconstructions. All code, dataset splits, and evaluation utilities are released to facilitate reproducible comparison and future extensions of multivariate posterior models.
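
The correlated posterior and its closed-form KL term can be sketched in a few lines. The following NumPy version is illustrative only (not the released code), and it assumes the simplest setting in which the coupling matrix C is lower-triangular with unit diagonal, so that log det of the covariance reduces to the sum of the log-variances.

```python
import numpy as np

def mvae_sample_and_kl(mu, log_var, C):
    """Reparameterized sample and closed-form KL for a posterior
    N(mu, C diag(sigma^2) C^T) against a standard-normal prior.
    Assumes C is lower-triangular with unit diagonal, so
    log det(Sigma) = sum(log_var)."""
    sigma = np.exp(0.5 * log_var)
    eps = np.random.randn(*mu.shape)
    z = mu + C @ (sigma * eps)                 # correlated reparameterized sample
    Sigma = C @ np.diag(sigma**2) @ C.T
    k = mu.size
    kl = 0.5 * (np.trace(Sigma) + mu @ mu - k - np.sum(log_var))
    return z, kl
```

With C equal to the identity this recovers the diagonal-covariance VAE exactly, so the extra cost of the multivariate posterior is essentially the matrix products involving C.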


Graph-Convolutional-Beta-VAE for Synthetic Abdominal Aorta Aneurysm Generation

Fabbri, Francesco, Scarpolini, Martino Andrea, Iollo, Angelo, Viola, Francesco, Tudisco, Francesco

arXiv.org Artificial Intelligence

Synthetic data generation plays a crucial role in medical research by mitigating privacy concerns and enabling large-scale patient data analysis. This study presents a beta-Variational Autoencoder Graph Convolutional Neural Network framework for generating synthetic Abdominal Aorta Aneurysms (AAA). Using a small real-world dataset, our approach extracts key anatomical features and captures complex statistical relationships within a compact disentangled latent space. To address data limitations, we employed low-impact data augmentation based on Procrustes analysis, preserving anatomical integrity. Both deterministic and stochastic generation strategies enhance data diversity while ensuring realism. Compared to PCA-based approaches, our model performs more robustly on unseen data by capturing complex, nonlinear anatomical variations. This enables more comprehensive clinical and statistical analyses than the original dataset alone. The resulting synthetic AAA dataset preserves patient privacy while providing a scalable foundation for medical research, device testing, and computational modeling.
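
The Procrustes step behind the low-impact augmentation can be illustrated with a plain rigid alignment of two point sets. This is a minimal sketch (translation and rotation only, no scaling or reflection handling), and the toy 2-D points stand in for the AAA mesh vertices the paper would actually use.

```python
import numpy as np

def procrustes_align(A, B):
    """Rigidly align point set B to point set A (translation + rotation),
    the core step of ordinary Procrustes analysis. The optimal rotation
    comes from the SVD of the cross-covariance of the centered sets."""
    A0 = A - A.mean(axis=0)
    B0 = B - B.mean(axis=0)
    U, _, Vt = np.linalg.svd(B0.T @ A0)
    R = U @ Vt                          # rotation minimizing ||A0 - B0 R||
    return B0 @ R + A.mean(axis=0)
```

Once shapes share a common frame like this, small perturbations or interpolations between them stay anatomically plausible, which is what makes the augmentation "low-impact".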


Scalable Language Models with Posterior Inference of Latent Thought Vectors

Kong, Deqian, Zhao, Minglu, Xu, Dehong, Pang, Bo, Wang, Shu, Honig, Edouardo, Si, Zhangzhang, Li, Chuan, Xie, Jianwen, Xie, Sirui, Wu, Ying Nian

arXiv.org Machine Learning

We propose a novel family of language models, Latent-Thought Language Models (LTMs), which incorporate explicit latent thought vectors that follow an explicit prior model in latent space. These latent thought vectors guide the autoregressive generation of ground tokens through a Transformer decoder. Training employs a dual-rate optimization process within the classical variational Bayes framework: fast learning of local variational parameters for the posterior distribution of latent vectors, and slow learning of global decoder parameters. Empirical studies reveal that LTMs possess additional scaling dimensions beyond traditional LLMs, yielding a structured design space. Higher sample efficiency can be achieved by increasing training compute per token, with further gains possible by trading model size for more inference steps. Designed based on these scaling properties, LTMs demonstrate superior sample and parameter efficiency compared to conventional autoregressive models and discrete diffusion models. They significantly outperform these counterparts in validation perplexity and zero-shot language modeling. Additionally, LTMs exhibit emergent few-shot in-context reasoning capabilities that scale with model and latent size, and achieve competitive performance in conditional and unconditional text generation.
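
The dual-rate optimization can be conveyed with a deliberately tiny toy: fast inner gradient steps on a local variational mean (the "latent thought"), then one slow step on a global decoder parameter. A quadratic loss stands in for the negative ELBO here; all names, learning rates, and the scalar model are illustrative, not the paper's.

```python
import numpy as np

def fit_ltm_toy(x, fast_steps=20, fast_lr=0.5, slow_lr=0.2, epochs=200):
    """Toy dual-rate variational Bayes loop. The surrogate objective is
    0.5*||x - (W + z)||^2 + 0.5*||z||^2: a reconstruction term plus a
    prior penalty on the local latent z."""
    W = np.zeros_like(x)                # global decoder parameter (slow)
    z = np.zeros_like(x)                # local variational mean (fast)
    for _ in range(epochs):
        z = np.zeros_like(x)            # re-infer the local posterior each pass
        for _ in range(fast_steps):
            z -= fast_lr * ((W + z - x) + z)   # gradient wrt z
        W -= slow_lr * (W + z - x)             # slow decoder update
    return W, z
```

The separation mirrors the abstract's design space: inference-time compute (fast steps) and decoder capacity (slow parameters) can be traded against each other.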


Studying the Impact of Latent Representations in Implicit Neural Networks for Scientific Continuous Field Reconstruction

Xu, Wei, DeSantis, Derek Freeman, Luo, Xihaier, Parmar, Avish, Tan, Klaus, Nadiga, Balu, Ren, Yihui, Yoo, Shinjae

arXiv.org Artificial Intelligence

Learning a continuous and reliable representation of physical fields from sparse samples is a challenge that affects diverse scientific disciplines. In recent work, we presented a novel model, MMGN (Multiplicative and Modulated Gabor Network), based on implicit neural networks. In this work, we design additional studies leveraging explainability methods to complement the previous experiments and further enhance the understanding of the latent representations generated by the model. The adopted methods are general enough to be leveraged for any latent space inspection. Preliminary results demonstrate the contextual information incorporated in the latent representations and their impact on model performance. As this is a work in progress, we will continue to verify our findings and to develop novel explainability approaches.


Continuous Field Reconstruction from Sparse Observations with Implicit Neural Networks

Luo, Xihaier, Xu, Wei, Ren, Yihui, Yoo, Shinjae, Nadiga, Balu

arXiv.org Artificial Intelligence

Reliably reconstructing physical fields from sparse sensor data is a challenge that frequently arises in many scientific domains. In practice, the process generating the data is often not understood to sufficient accuracy, so there is growing interest in deep-neural-network approaches to the problem. This work presents a novel approach that learns a continuous representation of the physical field using implicit neural representations (INRs). Specifically, after factorizing spatiotemporal variability into spatial and temporal components using the separation-of-variables technique, the method learns relevant basis functions from sparsely sampled irregular data points to develop a continuous representation of the data. In experimental evaluations, the proposed model outperforms recent INR methods, offering superior reconstruction quality on simulation data from a state-of-the-art climate model and a second dataset that comprises ultra-high-resolution satellite-based sea surface temperature fields. Achieving accurate and comprehensive representation of complex physical fields is pivotal for tasks spanning system monitoring and control, analysis, and design. However, in a multitude of applications, encompassing geophysics (Reichstein et al., 2019), astronomy (Gabbard et al., 2022), biochemistry (Zhong et al., 2021), fluid mechanics (Deng et al., 2023), and others, using a sparse sensor network proves to be the most practical and effective solution. In meteorology and oceanography, variables such as atmospheric pressure, temperature, salinity/humidity, and wind/current velocity must be reconstructed from sparsely sampled observations. Currently, two distinct approaches are used to reconstruct full fields from sparse observations. Traditional physics model-based approaches are based on partial differential equations (PDEs).
These approaches draw upon theoretical techniques to derive PDEs rooted in conservation laws and fundamental physical principles (Hughes, 2012). Yet, in complex systems such as weather (Brunton et al., 2016) and epidemiology (Massucci et al., 2016), deriving comprehensive models that are both sufficiently accurate and computationally efficient remains elusive.
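
The separation-of-variables idea can be shown on a toy field: a spatiotemporal signal F(x, t) factorizes into spatial and temporal components, F ≈ Σ_i g_i(x) h_i(t). In this sketch a plain truncated SVD plays the role of the learned basis functions; the paper learns them with implicit neural representations from irregular samples instead.

```python
import numpy as np

# Build a toy field that is exactly a sum of two separable terms.
x = np.linspace(0, 1, 64)
t = np.linspace(0, 1, 32)
F = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * t)) \
  + 0.5 * np.outer(np.cos(4 * np.pi * x), np.sin(2 * np.pi * t))

# Truncated SVD recovers spatial modes (columns of U) and temporal
# modes (rows of Vt): the discrete analogue of g_i(x) and h_i(t).
U, s, Vt = np.linalg.svd(F, full_matrices=False)
r = 2                                    # the field is exactly rank 2
F_hat = (U[:, :r] * s[:r]) @ Vt[:r]      # rank-2 separable reconstruction
err = np.linalg.norm(F - F_hat) / np.linalg.norm(F)
```

The INR version replaces these fixed grid-bound modes with continuous functions, which is what allows evaluation at arbitrary coordinates rather than only at sensor locations.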


Adaptive Neural Networks Using Residual Fitting

Ford, Noah, Winder, John, McClellan, Josh

arXiv.org Artificial Intelligence

Current methods for estimating the required neural-network size for a given problem class, such as neural-architecture search and pruning, can be computationally intensive. In contrast, methods that add capacity to neural networks as needed may provide similar results to architecture search and pruning without requiring as much computation to find an appropriate network size. Here, we present a network-growth method that searches for explainable error in the network's residuals and grows the network if sufficient error is detected. We demonstrate this method using examples from classification, imitation learning, and reinforcement learning. Within these tasks, the growing network can often achieve better performance than small networks that do not grow, and similar performance to networks that begin much larger.
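
The growth criterion can be sketched with a simple probe: fit a cheap model to the current network's residuals, and grow only if the probe explains a meaningful fraction of the residual variance. This is an illustrative simplification (a linear probe and a made-up threshold), not the paper's exact procedure.

```python
import numpy as np

def grow_if_explainable(X, y, predict, tol=0.1):
    """Fit a linear probe to the current model's residuals. If the probe
    explains more than `tol` of the residual variance, the residual error
    is 'explainable' and the network should grow."""
    r = y - predict(X)
    rn = np.linalg.norm(r)
    if rn < 1e-12:                       # nothing left to explain
        return False, 0.0
    w, *_ = np.linalg.lstsq(X, r, rcond=None)
    explained = 1 - np.linalg.norm(r - X @ w) ** 2 / rn ** 2
    return explained > tol, explained
```

A model that already fits the data leaves unexplainable (noise-like) residuals and triggers no growth, while a systematically wrong model leaves structure the probe can detect.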


Temporal Weights

Kohan, Adam, Rietman, Ed, Siegelmann, Hava

arXiv.org Artificial Intelligence

In artificial neural networks, weights are a static representation of synapses. However, synapses are not static; they have their own interacting dynamics over time. To instill weights with interacting dynamics, we use a model describing synchronization that is capable of capturing core mechanisms of a range of neural and general biological phenomena over time. An ideal fit for these Temporal Weights (TW) are Neural ODEs, with continuous dynamics and a dependency on time. The resulting recurrent neural networks efficiently model temporal dynamics by computing on the ordering of sequences, and the length and scale of time. By adding temporal weights to a model, we demonstrate better performance, smaller models, and data efficiency on sparse, irregularly sampled time series datasets.
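
The synchronization dynamics behind such time-varying weights can be illustrated with Kuramoto-style coupled oscillators: each weight carries a phase that evolves under mutual coupling. This sketch integrates the dynamics with plain Euler steps (the paper uses Neural ODEs), and all constants are illustrative.

```python
import math

def evolve_phases(phases, omegas, coupling=1.0, dt=0.01, steps=2000):
    """Euler-integrate Kuramoto synchronization dynamics:
    dphi_i/dt = omega_i + (K/n) * sum_j sin(phi_j - phi_i).
    With strong coupling, oscillators sharing a natural frequency
    lock phases, giving weights a coherent temporal evolution."""
    n = len(phases)
    for _ in range(steps):
        new = []
        for i in range(n):
            drive = sum(math.sin(pj - phases[i]) for pj in phases) / n
            new.append(phases[i] + dt * (omegas[i] + coupling * drive))
        phases = new
    return phases
```

An effective weight at time t would then be read out as a function of its phase, so the ordering and spacing of observations directly shape the computation.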


Generating Images

#artificialintelligence

The idea is to generate the same image through the model for a given sample image. The application is to use the model architecture to complete an occluded (half-filled) image. Basic encoder-decoder and deep CNN encoder-decoder models are implemented from scratch, trained, and analysed on three datasets. The analysis also covers finding a hidden representation of a suitable size for each dataset, which can then be used in applications. Some well-known approaches to image generation are Autoencoders, Generative Adversarial Networks (GANs), auto-regressive models (PixelRNN, PixelCNN), and DRAW.
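
The occlusion-completion idea can be demonstrated with the simplest possible autoencoder: a linear one. In this sketch PCA plays the encoder/decoder role on toy flat "images" lying in a 4-dimensional hidden space, and the occluded half is filled in by solving for the latent code from the observed pixels alone; a deep CNN encoder-decoder would replace the linear maps in practice, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
basis = rng.normal(size=(4, 16))               # 4 latent factors, 16 "pixels"
images = rng.normal(size=(200, 4)) @ basis     # toy dataset of flat images

# Learn a hidden representation of size 4 (PCA as a linear autoencoder).
mean = images.mean(axis=0)
_, _, Vt = np.linalg.svd(images - mean, full_matrices=False)
E = Vt[:4]                                     # encoder rows = decoder columns

# Complete an image whose bottom half is occluded: infer the latent code
# from the observed top half only, then decode the full image.
target = images[0]
obs = np.arange(8)                             # indices of observed pixels
A = E[:, obs].T                                # decoder restricted to observed pixels
z, *_ = np.linalg.lstsq(A, target[obs] - mean[obs], rcond=None)
completed = E.T @ z + mean                     # fills in the occluded half
```

Because the toy images lie exactly in the learned subspace, the completion is essentially exact here; with real images the same pipeline yields an approximation whose quality depends on the hidden size, which is what the dataset-by-dataset analysis above is probing.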